4a. Identify new tagging events
First, create a new column that identifies newly caught animals based on “date distance” (the number of days since the tag was last recorded). I did this because the recap column, which should identify when new animals are caught, is not reliable. A tag record is flagged as “new” if either of the following holds:
It is the earliest record of the tag (first time tag is ever recorded)
The tag was recorded 2 or more years (date distance >= 730 days) after the previous observation of that tag
Assumption: If an animal has not been recaptured within 2 years, it is assumed to be dead or to have emigrated from the population. Thus, later observations of that tag are assumed to belong to a new animal.
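The two rules can be checked on a toy example. This is only a minimal sketch: the data frame `toy`, its dates, and the name `cutoff` are made up for illustration, and it assumes that `date_dist` is the number of days since the previous record of the same tag (NA for the first record). It also leaves out the `animal == "y"` condition used in the real code below.

```r
library(dplyr)

# Hypothetical records: tag "0001" is seen three times, tag "0002" once
toy <- tibble(
  tag_id_new = c("0001", "0001", "0001", "0002"),
  date = as.Date(c("2010-01-01", "2010-06-01", "2013-01-01", "2011-03-15"))
)

cutoff <- 730  # date-distance cutoff in days (2 years)

toy_flagged <- toy %>%
  arrange(tag_id_new, date) %>%
  group_by(tag_id_new) %>%
  mutate(
    # days since the previous record of this tag; NA for the first record
    date_dist = as.numeric(date - lag(date)),
    # rule 1: first record of the tag; rule 2: gap of at least `cutoff` days
    new_tag = ifelse(is.na(date_dist) | date_dist >= cutoff, "y", "n")
  ) %>%
  ungroup()

toy_flagged
```

In this toy data the third record of tag "0001" comes 945 days after the second, so it is flagged as a new tagging event, while the second record (151 days after the first) is treated as a recapture.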
# Flag new tagging events for a given date-distance cutoff (dd, in days)
ddFUN <- function(dd){
  smTags3 %>%
    arrange(tag_id_new, date) %>%
    group_by(tag_id_new) %>%
    # create new column to identify when a new tagging event has occurred:
    # either the first record of the tag (is.na(date_dist)) or a record made
    # dd or more days after the previous record of that tag
    mutate(new_tag = ifelse(animal == "y" & is.na(date_dist), "y",
                     ifelse(animal == "y" & date_dist >= dd, "y",
                     ifelse(animal == "y" & date_dist < dd, "n", NA))),
           dist_day = dd) %>% # record which cutoff was used
    ungroup() %>%
    arrange(new_tag, tag_id_new, date) %>% # arrange data to get new tag events together
    group_by(new_tag, tag_id_new) %>% # group data
    mutate(n = ifelse(new_tag == "y", n(), NA)) # number of new tag events per tag ID
}
dd <- c(365, 730, 1096) # cutoffs of roughly 1, 2, and 3 years, in days
smTags4_dd <- map_df(dd, ddFUN) # one copy of the data per cutoff, stacked row-wise
smTags4_dd %>%
  filter(!is.na(tag_id_new)) %>%
  ggplot() +
  geom_bar(aes(x = n, fill = as.factor(dist_day)), position = "dodge", alpha = 0.4, color = "grey") +
  xlab("# times tag used") +
  scale_fill_manual(labels = c("1 yr", "2 yrs", "3 yrs"), values = c("red", "orange", "yellow")) +
  labs(fill = "date distance cutoff") +
  ggtitle("Identifying duplicate tags based on date distance")

The number of newly identified tagging events depends on the date-distance cutoff used. In other words, if a tag goes unrecorded for 1 year and is then recorded again, was the tag reused on a new animal (a duplicate tag), or was the same animal simply not observed for a year? What about 2 years? 3 years? If we choose too short a cutoff, we bias ourselves towards more duplicate tags (higher turnover of animals & higher recapture rate). If we choose too long a cutoff, we bias ourselves towards longer-lived and more sporadically caught animals (lower turnover of animals and lower recapture rate). Which value to use? It’s impossible to know without more info…
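To make the effect of the cutoff concrete, here is a small sketch with invented numbers. The vector `gaps` is hypothetical (date distances in days for five records of a single tag, with NA for the first record); it is not taken from the data.

```r
# Hypothetical gaps (days) between consecutive records of one tag
gaps <- c(NA, 90, 400, 800, 1200)
cutoffs <- c(365, 730, 1096)

# Count the records that would be flagged as new tagging events under each cutoff
n_new <- sapply(cutoffs, function(dd) sum(is.na(gaps) | gaps >= dd))
names(n_new) <- cutoffs
n_new
```

With these made-up gaps, the same tag is counted as 4, 3, or 2 different animals depending on whether the cutoff is 365, 730, or 1096 days.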
I will choose the intermediate value: 2 years, or 730 days.
ddc <- 730 # date distance cutoff is 730 days, or 2 years
smTags4 <- smTags3 %>%
  arrange(tag_id_new, date) %>%
  group_by(tag_id_new) %>%
  # create new column to identify when a new tagging event has occurred:
  # either the first record of the tag (is.na(date_dist)) or a record made
  # ddc or more days (2 years) after the previous record of that tag
  mutate(new_tag = ifelse(animal == "y" & is.na(date_dist), "y",
                   ifelse(animal == "y" & date_dist >= ddc, "y",
                   ifelse(animal == "y" & date_dist < ddc, "n", NA))))
# select columns for display in an interactive table
smTags4_display <- smTags4 %>%
  dplyr::select(quarter, year, date, day_no, x, y, animal, recap, tag_id_new, date_dist, trap_dist, new_tag)
DT::datatable(smTags4_display, filter = "top")
4b. Identify duplicate tags
Duplicate tags are made unique using this protocol:
Create a new column (n) that calculates the number of new tag events for each unique tag ID (e.g. if the same unique tag ID was recorded as being a new tag twice (based on the “new tag” column created in 4a), n = 2)
Create a new column (tag_group) that creates a group number for each new tag event for that tag ID
Make duplicated tags unique by adding 1000 or 2000 to the tag number:
If n = 1, the tag does not get changed
If n = 2, for the first tag (tag_group == 1), 1000 is added to the tag (e.g. if the original tag is 0001, the tag is now 1001); the second tag (tag_group == 2) remains the same
If n = 3, for the first tag (tag_group == 1), 1000 is added to the tag (0001 -> 1001); for the second tag (tag_group == 2), 2000 is added (0001 -> 2001); for the third tag (tag_group == 3), there is no change.
Tags were not used more than 3 times, so there is no n = 4.
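The renumbering rule can be sketched compactly on toy data. This is not the code used below, just an equivalent illustration: the data frame `events`, the `offset` column, and `tag_id_fixed` are hypothetical names, and each row is assumed to be one new tagging event. The rule reduces to "the most recent use keeps its number; earlier uses get tag_group * 1000 added".

```r
library(dplyr)

# Hypothetical new tagging events: tag 1 used three times, tag 2 twice, tag 3 once
events <- tibble(
  tag_no_new = c(1, 1, 1, 2, 2, 3),
  date = as.Date(c("2005-01-01", "2008-01-01", "2011-01-01",
                   "2006-01-01", "2010-01-01", "2007-01-01"))
)

events_fixed <- events %>%
  arrange(tag_no_new, date) %>%
  group_by(tag_no_new) %>%
  mutate(
    n = n(),                   # number of new tagging events for this tag
    tag_group = row_number(),  # 1 = first use, 2 = second use, ...
    # the most recent use keeps the original number;
    # earlier uses get 1000 (first use) or 2000 (second use) added
    offset = ifelse(tag_group == n, 0, tag_group * 1000),
    # format as a zero-padded 4-digit tag ID
    tag_id_fixed = sprintf("%04d", as.integer(tag_no_new + offset))
  ) %>%
  ungroup()

events_fixed
```

For tag 1 the three uses become 1001, 2001, and 0001; for tag 2 they become 1002 and 0002; tag 3 stays 0003.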
smTags5 <- smTags4 %>%
  # make groups within tag; e.g. group 1 for first time tag used, group 2 for second time tag used, etc.
  ungroup() %>%
  arrange(new_tag, tag_id_new, date) %>% # arrange data to get new tag events together
  group_by(new_tag, tag_id_new) %>% # group data
  mutate(n = ifelse(new_tag == "y", n(), NA), # number of new tag events for each tag_id
         tag_group = ifelse(new_tag == "y", row_number(), NA)) %>% # number the new tag events within each tag_id (1, 2, ...)
  ungroup() %>%
  arrange(tag_id_new, date) %>% # re-arrange data to get tag id's together ordered by date
  fill(tag_group) %>% # fill in missing tag groups according to group id assigned to each new tag event
  # make duplicate tags unique by adding 1000's to old tags
  mutate(tag_no_prefix = ifelse(new_tag == "y" & n == 1, 0,
                         ifelse(new_tag == "y" & n == 2 & tag_group == 1, 1000,
                         ifelse(new_tag == "y" & n == 2 & tag_group == 2, 0,
                         ifelse(new_tag == "y" & n == 3 & tag_group == 1, 1000,
                         ifelse(new_tag == "y" & n == 3 & tag_group == 2, 2000,
                         ifelse(new_tag == "y" & n == 3 & tag_group == 3, 0, NA)))))),
         tag_no_new = as.numeric(tag_no_new),
         tag_no_prefix = as.numeric(tag_no_prefix)) %>%
  fill(tag_no_prefix) %>% # carry the offset forward to the recaptures that follow each new tag event
  mutate(tag_id_new = ifelse(!is.na(tag_prefix), tag_id_new, tag_no_prefix + tag_no_new), # tags with a prefix are left unchanged
         tag_id_new = ifelse(!is.na(tag_prefix), tag_id_new, str_pad(tag_id_new, 4, pad = "0")), # format numbers to 4 digits
         tag_id_change = ifelse(tag_no != tag_id_new, "y", "n")) # add column to indicate where tag ID has been identified as a duplicate tag and changed